IIT at TREC 2003, Task Classification and Document Structrure for Known-Item Search
نویسندگان
چکیده
This year’s TREC 2003 web task incorporated two retrieval tasks into a single set of experiments for Known-Item retrieval. We hypothesized that not all retrieval tasks should use the same retrieval approach when a single search entry point is used. We applied task classifiers on top of traditional web retrieval approaches. Our traditional retrieval is based on fusion of result sets generated by query runs over independent parts of the document structure. Our task classifiers combine query term analysis with known information resources and URL depth. This approach to task classification shows promise: our classified runs improved overall MRR effectiveness over our traditional retrieval results by ~10%; provided an MRR of .665; ranked 87% of relevant results in the top 10; correctly ranked the #1result 56% of the time. 67% of the queries performed above the average, and 49% above the median.
منابع مشابه
Enterprise, QA, Robust and Terabyte Experiments with Hummingbird SearchServer at TREC 2005
Hummingbird participated in 6 tasks of TREC 2005: the email known-item search task of the Enterprise Track, the document ranking task of the Question Answering Track, the ad hoc topic relevance task of the Robust Retrieval Track, and the adhoc, efficiency and named page finding tasks of the Terabyte Track. In the email known-item task, SearchServer found the desired message in the first 10 rows...
متن کاملIIT at TREC 2002 Linear Combinations Based on Document Structure and Varied Stemming for Arabic Retrieval
For TREC 10 we participated in the Named Page Finding Task and the Cross-Lingual Task. In the web track, we explored the use of linear combinations of term collections based on document structure. Our goal was to examine the effects of different term collection statistics based on document structure in respect to known item retrieval. We parsed documents into structural components and built spe...
متن کاملIIT TREC 2006: Genomics Track
For the TREC-2006 Genomics Track, we report on the effectiveness of composite information retrieval functions based on a dimensional data model for improving document, passage, and aspect search precision of genomics literature. We designed an approach, and developed a corresponding search engine, based on a novel dimensional data model capable of document, paragraph, sentence, and passage leve...
متن کاملIIT TREC 2005: Genomics Track
For the TREC-2005 Genomics Track ad-hoc retrieval task, we report on the development of a scalable information retrieval engine based on a relational data model for the integration of structured data and text. Our objectives are to meet the need for the integrated search of heterogeneous data sets of biomedical literature and structured data found in biological databases, and to demonstrate the...
متن کاملTREC 14 Enterprise Track at CSIRO and ANU
By the time of submission deadline, we completed two tasks: known-item search and discussion search. For both tasks, we used the PADRE retrieval system [1], in which the Okapi BM25 relevance function was implemented. Each message in the collection was treated as an independent document, so both topic distillation scoring and same site suppression mechanism were turned off (i.e. -nocool and –SSS...
متن کامل